Quantifying the Performance Benefits of Partitioned Communication in MPI
Partitioned communication was introduced in MPI 4.0 as a user-friendly
interface to support pipelined communication patterns, particularly common in
the context of MPI+threads. It provides the user with the ability to divide a
global buffer into smaller independent chunks, called partitions, which can
then be communicated independently. In this work we first model the performance
gain that can be expected when using partitioned communication. Next, we
describe the improvements we made to MPICH to enable those gains and provide
a high-quality implementation of MPI partitioned communication. We then
evaluate partitioned communication in various common use cases and assess the
performance in comparison with other MPI point-to-point and one-sided
approaches. Specifically, we first investigate two scenarios commonly
encountered for small partition sizes in a multithreaded environment: thread
contention and overhead of using many partitions. We propose two solutions to
alleviate the measured penalty and demonstrate their use. We then focus on
large messages and the gain obtained when exploiting the delay resulting from
computations or load imbalance. We conclude with our perspectives on the
benefits of partitioned communication and on the various results obtained.
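As a rough illustration of the kind of gain partitioned communication targets, consider a simple pipelining sketch (our own illustrative model, not the one derived in the paper): splitting a message into n partitions lets the transfer of partition i overlap the computation of partition i+1.

```python
# Illustrative pipeline model (an assumption for exposition, not the
# paper's model): each of n partitions takes c seconds to compute and
# t seconds to transmit.

def serial_time(n, c, t):
    # No overlap: compute all partitions, then send all of them.
    return n * c + n * t

def pipelined_time(n, c, t):
    # Perfect overlap: after the first partition is computed, each
    # subsequent compute overlaps the previous partition's transfer.
    return c + (n - 1) * max(c, t) + t

def speedup(n, c, t):
    return serial_time(n, c, t) / pipelined_time(n, c, t)

# With balanced compute and communication (c == t), the speedup
# approaches 2x as the number of partitions grows.
print(round(speedup(16, 1.0, 1.0), 3))  # → 1.882
```

Under this toy model the gain is capped at 2x; the point of the measurements in the paper is to see how close a real implementation can get to the idealized overlap, and at what partition-count overhead.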
C-Coll: Introducing Error-bounded Lossy Compression into MPI Collectives
With the ever-increasing computing power of supercomputers and the growing
scale of scientific applications, the efficiency of MPI collective
communications has become a critical bottleneck in large-scale distributed
and parallel processing. Large message sizes in MPI collectives are a
particular concern because they can significantly degrade overall
parallel performance. To address this issue, prior research simply applies
off-the-shelf fixed-rate lossy compressors in MPI collectives, leading to
suboptimal performance, limited generalizability, and unbounded errors. In this
paper, we propose a novel solution, called C-Coll, which leverages
error-bounded lossy compression to significantly reduce the message size,
resulting in a substantial reduction in communication cost. The key
contributions are three-fold. (1) We develop two general, optimized
lossy-compression-based frameworks for both types of MPI collectives
(collective data movement as well as collective computation), based on their
particular characteristics. Our framework not only reduces communication cost
but also preserves data accuracy. (2) We customize an optimized version based
on SZx, an ultra-fast error-bounded lossy compressor, which can meet the
specific needs of collective communication. (3) We integrate C-Coll into
multiple collectives, such as MPI_Allreduce, MPI_Scatter, and MPI_Bcast, and
perform a comprehensive evaluation based on real-world scientific datasets.
Experiments show that our solution outperforms the original MPI collectives as
well as multiple baselines and related efforts by 3.5-9.7X.
Comment: 12 pages, 15 figures, 5 tables, submitted to SC '2
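To make the "error-bounded" distinction concrete, here is a toy uniform quantizer (purely illustrative; SZx's actual pipeline is far more elaborate): every reconstructed value is guaranteed to lie within a user-chosen bound eps of the original, whereas a fixed-rate compressor fixes the output size and lets the error float.

```python
# Toy error-bounded quantizer (an illustrative sketch, not SZx):
# snapping each value to the nearest multiple of 2*eps guarantees a
# pointwise reconstruction error of at most eps.

def compress(values, eps):
    q = 2.0 * eps
    return [round(v / q) for v in values]  # small integer codes

def decompress(codes, eps):
    q = 2.0 * eps
    return [c * q for c in codes]

data = [0.013, -1.207, 3.14159, 2.5]
eps = 0.01
restored = decompress(compress(data, eps), eps)
assert all(abs(a - b) <= eps for a, b in zip(data, restored))
```

The same bound-preservation concern appears in collective computation (e.g., MPI_Allreduce), where compression error can accumulate across reduction steps, which is one reason a collective-aware design matters rather than dropping a compressor in blindly.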